Abstract

Several analog-to-digital conversion methods for bandlimited signals used in
applications, such as $\Sigma\Delta$ quantization schemes, employ coarse
quantization coupled with oversampling. The standard mathematical model for
the error accrued from such methods measures the performance of a given scheme
by the rate at which the associated reconstruction error decays as a function
of the oversampling ratio $\lambda$. It was recently shown that exponential
accuracy of the form $O(2^{-\alpha\lambda})$ can be achieved by appropriate
one-bit Sigma-Delta modulation schemes. However, the best known achievable
rate constants $\alpha$ in this setting differ significantly from the general
information theoretic lower bound. In this paper, we provide the first lower
bound specific to coarse quantization, thus narrowing the gap between existing
upper and lower bounds. In particular, our results imply a quantitative
correspondence between the maximal signal amplitude and the best possible
error decay rate. Our method draws from the theory of large deviations.



1 Introduction 

Many signals of practical engineering interest are naturally produced in
analog form; at the same time, it is becoming more efficient and robust to
store and transmit signals in digital form. Therefore, the study of accurate
and tractable methods for analog-to-digital (A/D) conversion, or the
approximation of real-valued signals using a finite alphabet, is of great
importance in modern signal processing.

In the setting of A/D conversion, the signal of interest $x(t)$ is often
modeled as a bounded bandlimited function. According to the well-known
Shannon-Nyquist sampling theorem, such functions are completely determined by
their values $x_n = x(\frac{n}{\lambda})$ sampled at a frequency $\lambda$
greater than the signal bandwidth. The original signal can be reconstructed
from these samples using convolutional decoding of the form
$x(t) = \frac{1}{\lambda} \sum_{n \in \mathbb{Z}} x_n\, g(t - \frac{n}{\lambda})$.
For exact equality, the Fourier transform of the function
$g \in L^\infty(\mathbb{R})$ needs to approximate the characteristic function
of the frequency support of $x(t)$; in particular, $\hat{g}$ needs to have
compact support. In that case, the reconstruction formula represents an ideal
low-pass filter. Conversion between analog and digital representations for
$x(t)$ may be achieved by replacing the input sequence $(x_n)$ by a sequence
$(q_n)$ of quantized values chosen from a finite set such that the signal

$$\tilde{x}(t) = \frac{1}{\lambda} \sum_{n \in \mathbb{Z}} q_n\, g\Big(t - \frac{n}{\lambda}\Big) \tag{1}$$

formed by replacing the $x_n$'s with the $q_n$'s yields a good approximation
of $x$. In applications, one is often forced to approximate the ideal low-pass
filter $g$ by a filter $\varphi$ satisfying additional constraints, such as
compact (time) support.

In addition, one sometimes restricts attention to recovering the values $x_j$
on the sampling grid only. Consequently, such a quantization scheme fixes
a finite-length reconstruction filter $\varphi_n$, and approximate recovery is
then obtained if



In this paper we will focus on the continuous scenario (1), but we will allow
for (almost) arbitrary reconstruction kernels $\varphi$. Similar techniques
extend to corresponding results for the discrete scenario (2).

Quantization schemes employed in practice 

In pulse code modulation, the sampling frequency $\lambda$ is close to the
critical sampling frequency, and the quantized value $q_n$ is taken to be a
truncated binary representation of the sample $x_n$. To increase the accuracy
of this approximation, one takes longer binary expansions of each sample. In
particular, if $m$ bits are allotted to each truncated binary expansion, then
the distortion $\|x - \tilde{x}\|_{L^\infty}$ decreases like $O(2^{-m})$.
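The $O(2^{-m})$ behaviour can be checked on individual samples. The following is a minimal sketch, assuming samples normalized to $[-1, 1]$ and modelling the truncated binary expansion as rounding down to a uniform grid of $2^m$ levels; the helper names `truncate_to_bits` and `worst_truncation_error` are hypothetical and do not come from the paper.

```python
import math

def truncate_to_bits(x: float, m: int) -> float:
    """Truncate x in [-1, 1] to an m-bit grid of step 2**(1 - m).

    Assumption for illustration: the m-bit expansion is modelled as rounding
    down to one of the 2**m levels -1, -1 + step, ..., 1 - step.
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("sample must lie in [-1, 1]")
    step = 2.0 ** (1 - m)
    # Clamp so that x = 1 maps to the top level instead of overflowing.
    level = min(math.floor((x + 1.0) / step), 2 ** m - 1)
    return -1.0 + level * step

def worst_truncation_error(m: int, grid: int = 10_001) -> float:
    """Largest truncation error over an evenly spaced grid of test samples."""
    xs = (-1.0 + 2.0 * i / (grid - 1) for i in range(grid))
    return max(abs(x - truncate_to_bits(x, m)) for x in xs)
```

Under these assumptions the worst-case error is at most the grid step $2 \cdot 2^{-m}$, so each additional bit halves the error bound.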

On the other hand, the set of admissible values for $q_n$ in oversampled
coarse quantization methods is restricted to a fixed alphabet $\mathcal{A}$ of
reasonably small size, and more accurate approximations are obtained by
increasing the sampling rate $\lambda$. In the extreme case of one-bit
quantization, one chooses the alphabet $\mathcal{A}_1 = \{-1, +1\}$. For
$K$-bit quantization, the $q_n$ are taken from the set $\mathcal{A}_K$
consisting of $2^K$ evenly spaced values in the closed interval $[-1, 1]$. The
number of bits spent per unit time interval in this setting is
$m = \lambda \log_2 |\mathcal{A}_K| = \lambda K$. From the viewpoint of
circuit engineering, oversampled coarse quantization is associated with
low-cost analog hardware, because increasing the sampling rate is cheaper than
refining the quantization. Consequently, oversampling data converters are
often used for low to medium-bandwidth signals, such as audio signals [10]
and, more recently, for wireless communication [5]. Further advantages of
oversampled coarse quantization methods include a built-in redundancy and
robustness against errors resulting from imperfections in the analog circuit
implementation. This robustness comes as a consequence of the more
'democratic' distribution of bit significance in the reconstruction formula,
see [1]; in the extreme case of one-bit quantization, the individual bits
$q_n \in \{-1, 1\}$ carry equal significance.
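To make the bit budget concrete, the following sketch builds the evenly spaced alphabet and evaluates $m = \lambda \log_2 |\mathcal{A}_K|$. The function names `alphabet` and `bits_per_unit_time` are illustrative assumptions, not notation from the paper.

```python
import math

def alphabet(K: int) -> list[float]:
    """Return the 2**K evenly spaced values in [-1, 1], endpoints included."""
    size = 2 ** K
    return [-1.0 + 2.0 * i / (size - 1) for i in range(size)]

def bits_per_unit_time(lam: float, K: int) -> float:
    """Bit budget m = lam * log2(|A_K|), which equals lam * K."""
    return lam * math.log2(len(alphabet(K)))
```

For instance, a one-bit scheme ($K = 1$) at oversampling ratio $\lambda = 16$ spends the same 16 bits per unit time as a 16-bit expansion taken at the critical rate.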

Our work in relation to prior advances 

In this paper, we show that these advantages of coarse quantization come
at the price of sub-optimal accuracy of the resulting convolutional
approximation. It is well-known (see, for example, [8], [7]) that no
quantization scheme spending $m$ bits per Nyquist interval can beat the error
decay of $O(2^{-m})$ achieved by pulse code modulation. This optimal rate of
decay is not possible for coarse quantization in the discrete setting (2),
following the work of Calderbank and Daubechies [1]. Until now, tighter lower
bounds for coarse quantization have been available only under the white noise
hypothesis, where one assumes that the quantization error $x_n - q_n$ is
distributed like Gaussian white noise, and in conjunction with additional
technical assumptions [3]. In contrast, the lower bounds we shall provide hold
for any $K$-bit quantization scheme, without any additional assumptions, and
independent of the encoding algorithm used to generate the $q_n$.

As the main contribution of this paper, we provide an explicit lower bound
on the error decay achievable by $K$-bit quantization. Normalizing such that
the $q_n$'s are chosen from an evenly spaced alphabet with endpoints $-1$ and
$1$, and such that the bandlimited functions of interest are bounded in
amplitude by $\mu < 1$, we will show that the rate of decay of
$\|\tilde{x} - x\|_\infty$ is bounded below by $O(2^{-\alpha m})$, where
$\alpha = 1 - K^{-1}\big(1 - h(\frac{1+\mu}{2})\big)$ and $h$ is the unbiased
binary entropy function $h(u) = -(1-u) \log_2(1-u) - u \log_2 u$. In fact, the
best known upper bounds for $K$-bit quantization are also of the form
$O(2^{-rm})$. Such a bound was first achieved via a construction by
Güntürk [7] of a family of one-bit $\Sigma\Delta$ quantization schemes. These
constructions were later refined by Deift, Güntürk, and Krahmer [2], yielding
the rate constant $r \approx 0.102$. As this rate constant is achieved only
over input signals of maximal amplitude $\mu < 0.05$, this upper bound does
not stand in contradiction to our lower bound, which implies in particular
that the best possible rate constant tends to zero as $\mu \to 1$.

Organization of the paper 

After precisely setting up the problem and clarifying our notation in Section 
2, we summarize our results in Section 3. In Section 4, we recall important 
concepts and results from the theory of large deviations. In that section, we 
also recall results from the theory of Banach spaces which we use in the proof 
of our main theorem, which is presented in Section 5. 



2 Notation and setup 

Before continuing, let us survey the notation used in this paper. We use the
Landau $O$-notation $f(x) = O(h(x))$ (and $f(x) = o(h(x))$) to mean that for
some $M > 0$ (or any $M > 0$, respectively), there exists a real number $u_0$
such that $|f(u)| \le M |h(u)|$ for all $u > u_0$. Let $\mathcal{S}$ denote
the Schwartz space of rapidly decreasing functions on $\mathbb{R}$. For the
Fourier transform, we use the normalization

$$\hat{x}(\omega) = \int_{-\infty}^{\infty} x(t)\, \exp(-2\pi i \omega t)\, dt. \tag{3}$$

We define the class $\mathcal{B}_\Omega(\mathbb{R})$ of $\Omega$-bandlimited
functions to be the space of real-valued continuous functions in
$L^\infty(\mathbb{R})$ whose Fourier transforms (in the distributional sense)
have support contained in $[-\Omega/2, \Omega/2]$. Henceforth, we will
normalize $\Omega = 1$. The classical sampling theorem for bandlimited
functions states that if $\lambda > 1$, then any function $x$ in the class
$\mathcal{B}_1(\mathbb{R})$ and having bounded amplitude can be recovered from
its samples $\{x(\frac{n}{\lambda})\}_{n \in \mathbb{Z}}$ as a weighted sum of
translates of an averaging kernel $g$ via the formula

$$x(t) = \frac{1}{\lambda} \sum_{n \in \mathbb{Z}} x\Big(\frac{n}{\lambda}\Big)\, g\Big(t - \frac{n}{\lambda}\Big), \tag{4}$$

where $g$ is any kernel whose Fourier transform satisfies

$$\hat{g}(\omega) = \begin{cases} 1, & \text{if } |\omega| \le \pi, \\ 0, & \text{if } |\omega| \ge \lambda_0 \pi, \end{cases} \tag{5}$$

for some arbitrary $\lambda_0$ with $\lambda > \lambda_0 > 1$. Note that with
such a $g$, the reconstruction formula (4) describes an ideal low-pass filter.
Note also that any such kernel $g$ with finite frequency support must have
infinite (time) support, according to the uncertainty principle. Such ideal
filters with infinite support are cumbersome to construct, and in practice are
often approximated by kernels having finite support. In this case, the
reconstruction formula (4) holds at most approximately. A priori it is not
clear that this approximation always has a negative effect on the accuracy of
the associated quantization schemes. For this reason, in the subsequent
analysis we will not restrict the choice of the filter by more than a simple
smoothness condition. We will use, however, the normalization arising
naturally in the ideal case. There one has by (5) that

$$\int g(t)\, dt = \hat{g}(0) = 1; \tag{6}$$

we adopt this normalization for general kernels $\varphi$.

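A $K$-bit quantization scheme assigns, to each input function $x$ and to each
sampling rate $\lambda > \lambda_0$, a sequence $(q_n^\lambda)_{n \in \mathbb{Z}}$
of values from an evenly spaced alphabet $\mathcal{A}_K$ of size
$|\mathcal{A}_K| = 2^K$ in such a way that the approximation

$$\tilde{x}_\lambda(t) = \frac{1}{\lambda} \sum_{n \in \mathbb{Z}} q_n^\lambda\, \varphi\Big(t - \frac{n}{\lambda}\Big) \tag{7}$$

approaches $x(t)$ as $\lambda \to \infty$. Consequently, the approximation
quality resulting from a particular sequence $\{q_n^\lambda\}_{n \in \mathbb{Z}}$
of quantized values together with a reconstruction kernel $\varphi$ is
commonly assessed by the reconstruction error,

$$e_{q,\lambda}(t) := x(t) - \frac{1}{\lambda} \sum_{n \in \mathbb{Z}} q_n^\lambda\, \varphi\Big(t - \frac{n}{\lambda}\Big), \qquad q_n^\lambda \in \mathcal{A}_K, \tag{8}$$

and its supremum norm. We shall normalize the $K$-bit quantization alphabet
$\mathcal{A}_K$ so as to have extreme values $+1$ and $-1$.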

With this normalization on the alphabet in place and the kernel normalization
(6), the approximate reconstruction in (7) is essentially a weighted local
average of the $q_n^\lambda$. Hence we cannot expect good approximation for
signals whose amplitude exceeds 1. For this reason, we fix $\mu < 1$ and work
with the space $\mathcal{B}_1(\mathbb{R}, \mu)$ defined to be the class of
functions in $\mathcal{B}_1(\mathbb{R})$ with amplitude bounded by $\mu$ on
the whole real line. Thus we will study

$$E_K^\mu(\lambda) := \sup_{x \in \mathcal{B}_1(\mathbb{R}, \mu)}\ \inf_{q_n^\lambda \in \mathcal{A}_K} \|e_{q,\lambda}\|_{L^\infty}. \tag{9}$$

3 Summary of results 

Our main result concerns a lower bound on the rate of decay for $K$-bit
quantization of bandlimited functions in terms of the maximal amplitude
$\mu$:

Theorem 3.1. Consider a $K$-bit quantization scheme associated with a
reconstruction kernel $\varphi \in \mathcal{S}$, normalized so that
$\int \varphi(t)\, dt = 1$. If the optimal rate of decay for such a scheme
satisfies $E_K^\mu(\lambda) = O(2^{-\alpha K \lambda})$, then

$$\alpha \le 1 - K^{-1}\Big(1 - h\Big(\frac{1}{2} + \frac{\mu}{2}\Big)\Big),$$

where $h(p) = -p \log_2 p - (1-p) \log_2(1-p)$ is the binary entropy function.

Theorem 3.1 represents a quantitative improvement over the general lower
bound, which for $K$-bit quantization reads

$$E_K^\mu(\lambda) \ge O(2^{-K\lambda}), \tag{10}$$

as well as over the corresponding strict inequality in the discrete case (as
mentioned above).

The lower bound provided in Theorem 3.1 is most markedly improved over
the previous lower bound (10) in the case of one-bit quantization, $K = 1$.
In this case, the bound reduces to $\alpha \le h(\frac{1+\mu}{2})$. In
Figure 1 we compare our lower bound with the best-known upper bounds from [2]
in this setting. Observe that in the limit as $\mu \to 1$, the upper and lower
bounds both yield $\alpha = 0$. For small $\mu$, however, there is a
considerable gap between the lower bounds provided in this paper and the
best-known constructive upper bounds in [2]. A possible explanation for that
fact is that our lower bounds hold for arbitrary bit sequences, while there
need not be a constructive procedure to find the optimal bit sequence from a
signal.
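The dependence of the bound on $K$ and $\mu$ is easy to tabulate. The sketch below evaluates the right-hand side of Theorem 3.1, taking $h$ to be the nonnegative binary entropy in bits; the function names are assumptions made for this illustration, and the endpoint values $\mu = 0$ and $\mu = 1$ are included only as limiting cases.

```python
import math

def binary_entropy(p: float) -> float:
    """h(p) = -p log2 p - (1 - p) log2(1 - p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def rate_constant_bound(K: int, mu: float) -> float:
    """Right-hand side of Theorem 3.1: 1 - (1 - h((1 + mu) / 2)) / K."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return 1.0 - (1.0 - binary_entropy(0.5 * (1.0 + mu))) / K
```

For $K = 1$ the value decreases from 1 at $\mu = 0$ (the general bound (10)) to 0 as $\mu \to 1$, while for larger $K$ the limiting value as $\mu \to 1$ is $1 - K^{-1}$.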




Figure 1: The rate constants $\alpha$ corresponding to upper and lower bounds
for the error decay as a function of $\mu$ and for 1-bit quantization.



Intuition behind Theorem 3.1

That the performance of $K$-bit quantization schemes should depend on the
maximal amplitude $\mu$ can be understood as follows. Among the $2^{KN}$
sequences of length $N$ comprised of elements $q_n \in \mathcal{A}_K$, most of
the sums $\sum_{n=0}^{N-1} q_n$ will have an average near zero. Now the values
of the reconstructed function
$\tilde{x} = \frac{1}{\lambda} \sum_{|n| < N} q_n\, \varphi(t - \frac{n}{\lambda})$
are computed as a local average of the $q_n$'s, hence most of the possible
$\tilde{x}$ are localized near zero as well. The larger $\mu$, the larger the
function values to be represented; the disproportion increases.
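For one-bit sequences this counting heuristic can be made exact. The sketch below counts the sequences in $\{-1, +1\}^N$ whose average is at least $\mu$ and reports the exponent at which their fraction decays; the function names are hypothetical, and the small tolerance in the rounding step is an implementation choice, not part of the argument.

```python
import math

def count_with_mean_at_least(N: int, mu: float) -> int:
    """Number of sequences in {-1, +1}**N whose average is at least mu."""
    # A sequence with k entries equal to +1 has average (2k - N) / N.
    k_min = math.ceil(N * (1.0 + mu) / 2.0 - 1e-9)  # tolerance against round-off
    return sum(math.comb(N, k) for k in range(k_min, N + 1))

def decay_exponent(N: int, mu: float) -> float:
    """Return -log2(fraction) / N, where fraction = count / 2**N."""
    return (N - math.log2(count_with_mean_at_least(N, mu))) / N
```

As $N$ grows, the exponent for $\mu = 0.5$ approaches $1 - h(0.75) \approx 0.189$ from above, which is the deficit $1 - h(\frac{1+\mu}{2})$ that appears in Theorem 3.1 for $K = 1$.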

Positive time sampling 

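We note that in practice, the input signal $x(t)$ is only accessible for
positive time $t > 0$, so one needs to reconstruct it from positive-time
samples $x(\frac{n}{\lambda})$, $n \in \mathbb{N}$, only. That is, it is more
realistic to consider approximations of the form

$$\tilde{x}_\lambda(t) = \frac{1}{\lambda} \sum_{n \in \mathbb{N}} q_n^\lambda\, \varphi\Big(t - \frac{n}{\lambda}\Big). \tag{11}$$

Then the quantity $T_0(\lambda)$, defined in (20) in Section 5.1, can be
interpreted as a 'calibration time' over which the approximation need not
hold. Accordingly, one measures the reconstruction error through

$$E_K^{\mu,+}(\lambda) := \sup_{x \in \mathcal{B}_1(\mathbb{R}, \mu)}\ \sup_{t > T_0(\lambda)} \Big| x(t) - \frac{1}{\lambda} \sum_{n \in \mathbb{N}} q_n^\lambda\, \varphi\Big(t - \frac{n}{\lambda}\Big) \Big|. \tag{12}$$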



This is the same scenario as considered in [7], so the effect of using only
positive-time samples can be controlled as in that work. One obtains the
following corollary to Theorem 3.1:

Corollary 3.2. If the optimal rate of decay for $E_K^{\mu,+}(\lambda)$
satisfies $E_K^{\mu,+}(\lambda) = O(2^{-\alpha K \lambda})$, then

$$\alpha \le 1 - K^{-1}\Big(1 - h\Big(\frac{1+\mu}{2}\Big)\Big). \tag{13}$$



4 Background 

4.1 Inequalities from the theory of large deviations 

In order to make the intuition behind Theorem 3.1 rigorous, we need some
results from the theory of large deviations for Bernoulli random variables.
Recall that a Bernoulli random variable $X$ with bias $p$ takes values in the
set $\{0, 1\}$ with $P(X = 1) = p$. The relative entropy between two Bernoulli
distributions with associated biases $p$ and $a$ is given by
$H = H(a, p) := a \log_2 \frac{a}{p} + (1-a) \log_2 \frac{1-a}{1-p}$. In the
particular case $p = 1/2$, the relative entropy function simplifies to
$H(a, 1/2) = 1 - h(a)$, where $h(a) = -a \log_2(a) - (1-a) \log_2(1-a)$ is the
binary entropy function. For a sequence of independent Bernoulli random
variables $B_j$ with bias $p$, denote by $S_n := \sum_{j=1}^{n} B_j$ the
sequence of their partial sums. A basic result in the theory of large
deviations for Bernoulli sums reads

Proposition 4.1. For $p < a < 1$, and for $n \in \mathbb{N}$, one has

$$P_n(a) := P(S_n \ge na) \le 2^{-nH}. \tag{14}$$
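The bound (14) can be compared with the exact binomial tail. The following is a sketch for illustration only; `relative_entropy` and `binomial_upper_tail` are hypothetical helper names, and the rounding tolerance is an implementation detail.

```python
import math

def relative_entropy(a: float, p: float) -> float:
    """H(a, p) = a log2(a / p) + (1 - a) log2((1 - a) / (1 - p)) for 0 < a, p < 1."""
    return a * math.log2(a / p) + (1.0 - a) * math.log2((1.0 - a) / (1.0 - p))

def binomial_upper_tail(n: int, p: float, a: float) -> float:
    """Exact P(S_n >= n * a) for a sum S_n of n independent Bernoulli(p) variables."""
    k_min = math.ceil(n * a - 1e-9)  # tolerance against floating-point round-off
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )
```

Evaluating `binomial_upper_tail(n, p, a)` next to `2 ** (-n * relative_entropy(a, p))` for growing `n` shows both quantities decaying exponentially, with the exact tail staying below the bound.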
Among all sums of independent and identically distributed (i.i.d.) random
variables $X_j$ supported on $[0, 1]$ and with expected value
$\mathbb{E}X_j = p$, the Bernoulli sum presents the slowest exponential rate
of convergence towards zero for the probabilities of large deviation:

Proposition 4.2. Let $X_1, X_2, \ldots, X_n$ be independent and identically
distributed random variables on $[0, 1]$ with $\mathbb{E}[X_j] = p$. Then for
$p < a < 1$, and for $n \in \mathbb{N}$, one has

$$P\Big(\sum_{j=1}^{n} X_j \ge na\Big) \le 2^{-nH(a,p)}. \tag{15}$$

For more details on large deviations for Bernoulli sums (including a
detailed discussion of Proposition 4.1), we refer the reader to [5]. A more
complete introduction to the theory of large deviations can be found in [I].
For a proof of Proposition 4.2 see [I] and [9].

4.2 Kolmogorov $\varepsilon$-entropy

We need a few concepts from the theory of Banach spaces (cf. [8]). Let $Y$ be
a Banach space and $X \subset Y$ a compact subset. A set $\{f_i\}_{i \in \mathcal{I}}$,
$f_i \in Y$, is called an $\varepsilon$-net of $X$ in $Y$ if each $x \in X$
satisfies $\|x - f_i\|_\infty \le \varepsilon$ for some $i \in \mathcal{I}$.
Let $N = N_\varepsilon$ be the smallest number of functions
$f_1, \ldots, f_N \in Y$ forming an $\varepsilon$-net of $X$ in $Y$. The
quantity

$$H_\varepsilon := \log_2 N_\varepsilon \tag{16}$$

is the Kolmogorov $\varepsilon$-entropy (or metric entropy) of $X$ in $Y$.

We use the notation $\mathcal{B}_1(I, \mu)$ to refer to the class of functions
$x : I \to [-\mu, \mu]$ that are restrictions (to the interval $I$) of
functions in $\mathcal{B}_1(\mathbb{R}, \mu)$. This is a compact subset of
$C(I)$ with respect to the norm $\|\cdot\|_\infty$. The Kolmogorov
$\varepsilon$-entropy of $\mathcal{B}_1(I, \mu)$ in $C(I)$ is shift invariant
and can thus be denoted by $H_\varepsilon(|I|)$. It is known [8] that the
average Kolmogorov $\varepsilon$-entropy (per unit interval) of this space,
defined by

$$\bar{H}_\varepsilon := \lim_{|I| \to \infty} \frac{1}{|I|} H_\varepsilon(|I|), \tag{17}$$

exists and has the asymptotic behavior

$$\bar{H}_\varepsilon = (1 + o(1)) \log_2 \frac{\mu}{\varepsilon} \quad \text{as } \varepsilon \to 0. \tag{18}$$
Note that we may rewrite
$\log_2 \frac{\mu}{\varepsilon} = \log_2 \frac{1}{\varepsilon}\,(1 + o(1))$ as
$\varepsilon \to 0$, so that the asymptotic behavior of $\bar{H}_\varepsilon$
is independent of $\mu$.

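The average Kolmogorov $\varepsilon$-entropy of the space

$$\mathcal{B}_1^\delta(\mathbb{R}, \mu) := \mathcal{B}_1(\mathbb{R}, \mu) \cap \{f : \forall t:\ f(t) \in [\mu - \delta, \mu]\} = \mu - \frac{\delta}{2} + \mathcal{B}_1(\mathbb{R}, \delta/2) \tag{19}$$

has the same asymptotic behavior as that of $\mathcal{B}_1(\mathbb{R}, \mu)$.
To see this, we use that adding a constant does not change the
$\varepsilon$-entropy, and that the asymptotic behavior of the average
$\varepsilon$-entropy of $\mathcal{B}_1(\mathbb{R}, \delta/2)$ does not depend
on the amplitude bound $\delta/2$.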



5 Proof of Theorem 3.1 



We are now equipped with the necessary tools to prove our main result,
Theorem 3.1. We proceed by contradiction; more specifically we will show
that under the assumption that $E_K^\mu(\lambda) \le C 2^{-\alpha K \lambda}$
for fixed constants $\alpha > 1 - K^{-1}\big(1 - h(\frac{1+\mu}{2})\big)$ and
$C > 0$ and all $\lambda > 1$, one can construct $\varepsilon$-nets for spaces
of the type $\mathcal{B}_1^\delta(I, \mu)$ that violate the asymptotic bounds
for the average Kolmogorov $\varepsilon$-entropy given in Section 4.2.

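5.1 An $\varepsilon$-net for the whole space $\mathcal{B}_1(I, \mu)$

Let us restrict our attention to compact intervals of the form $I = [-a, a]$.
Then, closely following [7], we introduce $T_0(\lambda)$ for all $\lambda > 1$
to be the smallest number that satisfies

$$\int_{T_0(\lambda) - \frac{1}{\lambda}}^{\infty} \rho(s)\, ds \le 2^{-\alpha K \lambda}, \tag{20}$$

where $\rho \in L^1(\mathbb{R})$ is even symmetric on $\mathbb{R}$,
monotonically decreases on $\mathbb{R}^+$, and bounds $|\varphi|$ from above
everywhere. This quantity can be interpreted as the margin that needs to be
added to control the tail behavior of $\varphi$. For this reason, we consider
the larger 'padded' interval
$\bar{I} = [-a - T_0(\lambda),\ a + T_0(\lambda)]$, its dilation
$\lambda\bar{I} = [-\lambda(a + T_0(\lambda)),\ \lambda(a + T_0(\lambda))]$,
and the truncated approximation

$$f_\lambda(t) = \frac{1}{\lambda} \sum_{n \in \mathbb{Z} \cap \lambda\bar{I}} q_n^\lambda\, \varphi\Big(t - \frac{n}{\lambda}\Big). \tag{21}$$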
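Restricting to $t \in I$, this function is close to any possible extension of
the form
$\tilde{x}_\lambda = \frac{1}{\lambda} \sum_{n \in \mathbb{Z}} q_n^\lambda\, \varphi(t - \frac{n}{\lambda})$.
Indeed, for $n \in \mathbb{Z} \setminus \lambda\bar{I}$ one has
$|t - \frac{n}{\lambda}| > T_0(\lambda)$, so that

$$|\tilde{x}_\lambda(t) - f_\lambda(t)| \le \frac{1}{\lambda} \sum_{n \in \mathbb{Z} \setminus \lambda\bar{I}} \Big|\varphi\Big(t - \frac{n}{\lambda}\Big)\Big| \le 2 \int_{T_0(\lambda) - \frac{1}{\lambda}}^{\infty} \rho(s)\, ds \le 2^{-\alpha K \lambda + 1}. \tag{22}$$

Recall that
$E_K^\mu(\lambda) = \sup_{x \in \mathcal{B}_1(\mathbb{R}, \mu)} \inf_{q_n^\lambda \in \mathcal{A}_K} \|x - \tilde{x}_\lambda\|_{L^\infty(\mathbb{R})} \le C 2^{-\alpha K \lambda}$
is assumed. Then

$$\|x - f_\lambda\|_{L^\infty(I)} \le \|x - \tilde{x}_\lambda\|_{L^\infty(I)} + \|\tilde{x}_\lambda - f_\lambda\|_{L^\infty(I)} \tag{23}$$

$$\le C' 2^{-\alpha K \lambda} =: \varepsilon. \tag{24}$$

That is, for this choice of $\varepsilon$, the $f_\lambda$'s form an
$\varepsilon$-net for the space $\mathcal{B}_1(I, \mu)$. It is clear that as
$x$ varies in the set $\mathcal{B}_1(\mathbb{R}, \mu)$, the resulting
$\varepsilon$-net $F_\lambda$ has cardinality at most
$2^{K |\mathbb{Z} \cap \lambda\bar{I}|}$.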



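5.2 An $\varepsilon$-net for the reduced space $\mathcal{B}_1^\delta(I, \mu)$

By our main assumption there is a fixed constant $\alpha_0$ such that
$\alpha > \alpha_0 > 1 - K^{-1}\big(1 - h(\frac{1+\mu}{2})\big)$. By
continuity of $h$, we may fix $\delta > 0$ sufficiently small that
$\alpha_0 > 1 - K^{-1}\big(1 - h(\frac{1}{2} + \frac{\mu - 5\delta}{2})\big)$.
For this choice of $\delta$, we will now estimate the size of the
$\varepsilon$-net $F_\lambda^\delta$ arising in the same way as $F_\lambda$
when $x$ varies only over $\mathcal{B}_1^\delta(\mathbb{R}, \mu)$. Note that
$\delta$ may depend on $\mu$ but is independent of $\lambda$. Hence we can
assume without loss of generality that $\lambda$ is large enough to ensure
$\varepsilon < \delta$. We note that for all $t$ one has
$x(t) \ge \mu - \delta$, thus
$\tilde{x}_\lambda(t) \ge \mu - \delta - \varepsilon \ge \mu - 2\delta$, and,
by (22), $f_\lambda(t) \ge \mu - 3\delta$ for $t \in I$. Consequently,

$$F_\lambda^\delta \subset F_\lambda \cap \{f : \forall t \in I:\ f(t) \ge \mu - 3\delta\}. \tag{25}$$

Let $G_\lambda = [\mathcal{A}_K]^{\mathbb{Z} \cap \lambda\bar{I}}$, and
consider the subset of this class given by

$$G_\lambda^\delta := \Big\{ q \in G_\lambda : \forall t \in I:\ \frac{1}{\lambda} \sum_{n \in \mathbb{Z} \cap \lambda\bar{I}} q_n\, \varphi\Big(t - \frac{n}{\lambda}\Big) \ge \mu - 3\delta \Big\}. \tag{26}$$
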
Now consider a random variable $Q$ distributed according to a uniform
probability measure on $G_\lambda$. We observe that

$$|F_\lambda^\delta| \le |G_\lambda^\delta| = P(Q \in G_\lambda^\delta)\, |G_\lambda| = P(Q \in G_\lambda^\delta)\, 2^{K |\mathbb{Z} \cap \lambda\bar{I}|} \le P(Q \in G_\lambda^\delta)\, 2^{K \lceil \lambda(|I| + 2T_0(\lambda)) \rceil}. \tag{27}$$

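We would now like to estimate $P(Q \in G_\lambda^\delta)$:

1. Note that $Q$ agrees in distribution with a sequence of identically
distributed independent variables $Q_n$, $n \in \mathbb{Z} \cap \lambda\bar{I}$,
which have support on $[-1, 1]$ and expectation $\mathbb{E}[Q_n] = 0$.
Consequently, one obtains

$$P(Q \in G_\lambda^\delta) \le P\Big( \forall j \in \mathbb{Z} \cap \lambda I:\ \frac{1}{\lambda} \sum_{n \in \mathbb{Z} \cap \lambda\bar{I}} Q_n\, \varphi\Big(\frac{j-n}{\lambda}\Big) \ge \mu - 3\delta \Big)$$

$$\le P\Big( \frac{1}{\lambda} \sum_{j \in \mathbb{Z} \cap \lambda I}\ \sum_{n \in \mathbb{Z} \cap \lambda\bar{I}} Q_n\, \varphi\Big(\frac{j-n}{\lambda}\Big) \ge |\mathbb{Z} \cap \lambda I|\,(\mu - 3\delta) \Big)$$

$$= P\Big( \sum_{n \in \mathbb{Z} \cap \lambda\bar{I}} c_n Q_n \ge |\mathbb{Z} \cap \lambda I|\,(\mu - 3\delta) \Big), \tag{28}$$

where $c_n = \frac{1}{\lambda} \sum_{j \in \mathbb{Z} \cap \lambda I} \varphi\big(\frac{j-n}{\lambda}\big)$.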

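2. We would like to bound the coefficients $c_n$. By assumption, we have
$\varphi \in \mathcal{S}$. Therefore, we may apply the Poisson Summation
Formula,

$$\frac{1}{\lambda} \sum_{j \in \mathbb{Z}} \varphi\Big(\frac{j-n}{\lambda}\Big) = \sum_{k \in \mathbb{Z}} \hat{\varphi}(\lambda k) = \hat{\varphi}(0) + O(\lambda^{-1}) = 1 + O(\lambda^{-1}). \tag{29}$$

From now on, we assume that $a > T_0(\lambda)$. This assumption makes sense
as in the definition of the average Kolmogorov $\varepsilon$-entropy, one lets
$|I| \to \infty$ for each fixed $\lambda$. Then the interval
$\tilde{I} := [-(a - T_0(\lambda)),\ a - T_0(\lambda)]$ is non-empty; it
satisfies $\tilde{I} \subset I \subset \bar{I}$.

Now for $n \in \lambda\tilde{I}$, we have

$$|c_n - 1| = \Big| \frac{1}{\lambda} \sum_{j \in \mathbb{Z} \cap \lambda I} \varphi\Big(\frac{j-n}{\lambda}\Big) - 1 \Big| \le 2\delta + O(\lambda^{-1}) \tag{30}$$

as a consequence of (20). Furthermore, for $n \notin \lambda\tilde{I}$, we use
the crude estimate

$$|c_n| \le \|\rho\|_1 + 1 + \rho(0) =: D \tag{31}$$

in conjunction with the bound

$$|\mathbb{Z} \cap (\lambda\bar{I} \setminus \lambda\tilde{I})| \le 4\lambda T_0(\lambda) + 4 =: N(\lambda). \tag{32}$$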



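3. We now apply these bounds for the coefficients $c_n$ in (28). As
$\tilde{I} \subset I$, and $|I| = O(\lambda)$, we obtain

$$P(Q \in G_\lambda^\delta) \le P\Big( \sum_{n \in \mathbb{Z} \cap \lambda\tilde{I}} Q_n \ge |\mathbb{Z} \cap \lambda\tilde{I}|\,(\mu - 5\delta - O(\lambda^{-1})) - D N(\lambda) \Big) \tag{33}$$

$$\le P\Big( \sum_{n \in \mathbb{Z} \cap \lambda\tilde{I}} Q_n \ge |\mathbb{Z} \cap \lambda\tilde{I}|\,(\mu - 5\delta) - D' N(\lambda) \Big). \tag{34}$$

Rescaling the random variables $Q_n$ to yield independent and identically
distributed random variables supported on $[0, 1]$ with expectation equal
to $1/2$, we may apply Propositions 4.2 and 4.1 to bound the probability
of such a large deviation from the mean:

$$\log_2 P(Q \in G_\lambda^\delta) \le |\mathbb{Z} \cap \lambda\tilde{I}|\, \Big( h\Big(\tfrac{1}{2}\big(1 + \mu - 5\delta - D' N(\lambda)\, |\mathbb{Z} \cap \lambda\tilde{I}|^{-1}\big)\Big) - 1 \Big) \tag{35}$$

$$\le (\lambda|\tilde{I}| - 1)\, \Big( h\Big(\tfrac{1}{2}(1 + \mu - 5\delta)\Big)\,(1 + O(|I|^{-1})) - 1 \Big). \tag{36}$$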

(36) 

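Combining our estimate for $P(Q \in G_\lambda^\delta)$ with (27), we obtain
that as $x$ varies in $\mathcal{B}_1^\delta(\mathbb{R}, \mu)$, the $f_\lambda$
vary on a set $F_\lambda^\delta$ of cardinality at most

$$N := 2^{\lambda |I| \left(K - 1 + h\left(\frac{1}{2} + \frac{\mu - 5\delta}{2}\right)\right) \cdot (1 + O(|I|^{-1}))}. \tag{37}$$

As a consequence, for each $\lambda > 1$, there are arbitrarily long intervals
$I$ such that, for each $I$, there is an $\varepsilon$-net of
$\mathcal{B}_1^\delta(I, \mu)$ with at most $N$ elements.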
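The Riemann-sum identity (29) used for the coefficients $c_n$ can be checked numerically. The sketch below assumes, purely for illustration, the Gaussian kernel $\varphi(t) = e^{-\pi t^2}$, which lies in $\mathcal{S}$ and satisfies $\int \varphi = 1$ and $\hat{\varphi}(\omega) = e^{-\pi \omega^2}$ under the normalization (3); the function names and the truncation parameter `half_width` are hypothetical.

```python
import math

def gaussian_kernel(t: float) -> float:
    """Schwartz kernel exp(-pi t^2); its integral over the real line is 1."""
    return math.exp(-math.pi * t * t)

def riemann_coefficient(lam: float, n: int, half_width: int) -> float:
    """(1 / lam) * sum of phi((j - n) / lam) over integers j with |j - n| <= half_width."""
    total = sum(
        gaussian_kernel((j - n) / lam)
        for j in range(n - half_width, n + half_width + 1)
    )
    return total / lam
```

For this kernel the Poisson Summation Formula gives the sum over all integers $j$ as $\sum_k e^{-\pi \lambda^2 k^2}$, so the deviation from 1 is already of order $10^{-5}$ at $\lambda = 2$; a general Schwartz kernel need only satisfy the weaker $O(\lambda^{-1})$ estimate stated in (29).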
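5.3 Towards a contradiction

We have found that the size of an $\varepsilon$-net of
$\mathcal{B}_1^\delta(I, \mu)$ is bounded above by $N$, as given by equation
(37). Thus, we may bound the Kolmogorov $\varepsilon$-entropy $H_\varepsilon$
of the space $\mathcal{B}_1^\delta(I, \mu)$ (see Section 4.2) by

$$H_\varepsilon(|I|) \le \log_2(N) = \lambda |I| \Big(K - 1 + h\Big(\frac{1}{2} + \frac{\mu - 5\delta}{2}\Big)\Big)\,(1 + O(|I|^{-1})). \tag{38}$$

As $|I| \to \infty$, we may bound the average Kolmogorov $\varepsilon$-entropy
$\bar{H}_\varepsilon = \lim_{|I| \to \infty} \frac{H_\varepsilon(|I|)}{|I|}$
by $\bar{H}_\varepsilon \le \lambda\big(K - 1 + h(\frac{1}{2} + \frac{\mu - 5\delta}{2})\big)$.
Note that this also gives a bound on the average Kolmogorov
$\varepsilon$-entropy for the larger space $\mathcal{B}_1$. Recalling that
$\varepsilon = C' 2^{-\alpha K \lambda}$, or
$\log_2 \frac{1}{\varepsilon} = \alpha K \lambda - \log_2 C'$, and also
recalling our hypothesis that
$\alpha > \alpha_0 > 1 - K^{-1} + K^{-1} h(\frac{1 + \mu - 5\delta}{2})$, we
arrive at the chain of inequalities

$$\frac{\bar{H}_\varepsilon}{\log_2 \frac{1}{\varepsilon}} \le \Big(1 + \frac{\log_2 C'}{\alpha K \lambda - \log_2 C'}\Big) \frac{1 - K^{-1} + K^{-1} h\big(\frac{1}{2} + \frac{\mu - 5\delta}{2}\big)}{\alpha} \le \Big(1 + \frac{\log_2 C'}{\alpha K \lambda - \log_2 C'}\Big) \frac{\alpha_0}{\alpha}. \tag{39}$$

The right-hand side tends to $\alpha_0 / \alpha < 1$ as $\lambda \to \infty$.
This establishes a contradiction to (18), as $\lambda \to \infty$ and
consequently $\varepsilon \to 0$. □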



Remark. The assumption that the kernel $\varphi \in \mathcal{S}$ in our main
theorem is stronger than necessary. We used this assumption only to apply the
Poisson Summation Formula in the proof of Theorem 3.1; a weaker but more
technical requirement for this formula to hold is that
$|\varphi(t)| + |\hat{\varphi}(t)| \le C (1 + |t|)^{-(1+s)}$ for some $s > 0$.
In particular, our proof also works for twice continuously differentiable
kernels $\varphi$ with compact support, a scenario resembling filters used in
practice.